This notebook focuses on a single Strava activity, in this case a ride: the trajectory is analyzed to understand how the trip unfolded and which peculiarities it featured, if any.
This first part is shared by all notebooks, for simplicity. It takes all the collected Strava activities and stores them in lists, dataframes, geodataframes and trajectories, both overall and per activity type (runs, hikes and rides).
# Suppress geopandas warnings; skip this cell if you don't mind seeing them
import warnings
warnings.filterwarnings('ignore')
import numpy as np
import matplotlib.pyplot as plt
import gpxpy
from geopy.geocoders import Nominatim
import os
from tqdm import tqdm
import utils
activities = []
runs = []
hikes = []
rides = []
for file in tqdm(os.listdir("data/strava_activities/")):
    # Parse each GPX file; the context manager closes the file handle
    with open("data/strava_activities/{}".format(file), 'r') as gpx_file:
        gpx = gpxpy.parse(gpx_file)
    activity_type = gpx.tracks[0].type
    if activity_type in ('Run', 'running'):
        runs.append(gpx)
        activities.append(gpx)
    elif activity_type == 'Hike':
        hikes.append(gpx)
        activities.append(gpx)
    elif activity_type == 'Ride':
        rides.append(gpx)
        activities.append(gpx)
    # Files with any other track type are skipped
100%|██████████| 159/159 [00:16<00:00, 9.82it/s]
print("Total number of activities: \t{}".format(len(activities)))
print("Total number of runs: \t\t{}".format(len(runs)))
print("Total number of rides: \t\t{}".format(len(rides)))
print("Total number of hikes: \t\t{}".format(len(hikes)))
Total number of activities: 	152
Total number of runs: 		135
Total number of rides: 		3
Total number of hikes: 		14
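The progress bar reports 159 files, while only 152 activities were kept, so seven files did not match any of the handled types. A minimal sketch of how the type strings could be tallied to see what falls outside the three buckets. The mapping `TYPE_TO_BUCKET`, the helper `bucket_for` and the sample type strings (for example `"Walk"`) are hypothetical and only illustrate the idea; they are not taken from the dataset.

```python
from collections import Counter

# Hypothetical mapping from GPX track type to the bucket used in this notebook
TYPE_TO_BUCKET = {
    "Run": "runs",
    "running": "runs",
    "Hike": "hikes",
    "Ride": "rides",
}


def bucket_for(track_type):
    """Return the bucket name for a track type, or 'skipped' if unhandled."""
    return TYPE_TO_BUCKET.get(track_type, "skipped")


# Made-up type strings standing in for gpx.tracks[0].type values
sample_types = ["Run", "running", "Hike", "Ride", "Walk", None]

counts = Counter(bucket_for(t) for t in sample_types)
print(counts)
```

Applied to the real files, the same counter over `gpx.tracks[0].type` would list which types the seven skipped files carry.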
Which are the longest activities in terms of duration and distance? Let's pick the top 5 for each, along with the top 5 by elevation difference.
utils.getTopLongestTravelTime(activities, 5)
print("-----------------------------")
utils.getTopLongestTravel(activities, 5)
print("-----------------------------")
utils.getTopElevationDifference(activities, 5)
1° longest activity duration: 10:26:45, activity n° 91, type: Ride
2° longest activity duration: 09:37:45, activity n° 125, type: Hike
3° longest activity duration: 08:33:00, activity n° 15, type: Hike
4° longest activity duration: 07:59:45, activity n° 121, type: Hike
5° longest activity duration: 05:19:14, activity n° 7, type: Hike
-----------------------------
1° longest activity length: 102401.12, activity n° 91, type: Ride
2° longest activity length: 35057.82, activity n° 5, type: Ride
3° longest activity length: 25111.91, activity n° 45, type: Ride
4° longest activity length: 21448.31, activity n° 109, type: Run
5° longest activity length: 19784.85, activity n° 15, type: Hike
-----------------------------
1° highest elevation difference: 1489.0, activity n° 15, type: Hike
2° highest elevation difference: 1368.6, activity n° 7, type: Hike
3° highest elevation difference: 1266.0, activity n° 121, type: Hike
4° highest elevation difference: 969.6, activity n° 115, type: Hike
5° highest elevation difference: 961.6, activity n° 84, type: Hike
# list of dataframes of all the activities
activities_dfList = utils.toList(activities)
runs_dfList = utils.toList(runs)
rides_dfList = utils.toList(rides)
hikes_dfList = utils.toList(hikes)
100%|██████████| 152/152 [00:03<00:00, 47.61it/s]
100%|██████████| 135/135 [00:02<00:00, 58.53it/s]
100%|██████████| 3/3 [00:00<00:00, 17.46it/s]
100%|██████████| 14/14 [00:01<00:00, 12.23it/s]
Let's check the start and arrival points of my longest trip, activity n° 91.
# First and last recorded points of activity n° 91
start = activities_dfList[91].iloc[0]
end = activities_dfList[91].iloc[-1]
geolocator = Nominatim(user_agent="geospatial course unitn")
# Reverse-geocode each "lat,lon" string into an address
starting_place = geolocator.reverse("{},{}".format(start.latitude, start.longitude))
arrival_place = geolocator.reverse("{},{}".format(end.latitude, end.longitude))
print("Starting place: {}".format(starting_place))
print("Arrival place: {}".format(arrival_place))
Starting place: 14, Via Vittorio Veneto, Clarina, San Pio X, Trento, Territorio Val d'Adige, Provincia di Trento, Trentino-Alto Adige/Südtirol, 38122, Italia
Arrival place: Via Don Narciso Sordo, Clarina, San Pio X, Trento, Territorio Val d'Adige, Provincia di Trento, Trentino-Alto Adige/Südtirol, 38122, Italia
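The two addresses are in the same neighbourhood, and the gap between them can also be put into metres. A minimal sketch using the haversine formula on the first and last coordinates of activity n° 91, as they appear in the trajectory table further down in this notebook. The function name `haversine_m` and the spherical Earth radius of 6,371 km are assumptions of this sketch, not part of `utils`.

```python
import math


def haversine_m(lat1, lon1, lat2, lon2, radius_m=6371000.0):
    """Great-circle distance in metres between two (lat, lon) points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * radius_m * math.asin(math.sqrt(a))


# (latitude, longitude) of the first and last samples of activity n° 91
start = (46.057613, 11.119156)
end = (46.056612, 11.119790)

gap_m = haversine_m(*start, *end)
print("Start-to-end gap: {:.1f} m".format(gap_m))
```

The result is a little over 100 m, which fits a trip that starts and ends at essentially the same place.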
The start and arrival points are very close to each other. This could be a loop, so let's check by plotting it.
# Set time as index for movingpandas
for df_list in (activities_dfList, runs_dfList, rides_dfList, hikes_dfList):
    for df in df_list:
        df.set_index('time', drop=True, inplace=True)
# List of geodataframes of activities
geo_dfList = utils.toGdfList(activities_dfList)
runs_geo_dfList = utils.toGdfList(runs_dfList)
rides_geo_dfList = utils.toGdfList(rides_dfList)
hikes_geo_dfList = utils.toGdfList(hikes_dfList)
100%|██████████| 152/152 [00:03<00:00, 41.14it/s]
100%|██████████| 135/135 [00:02<00:00, 50.17it/s]
100%|██████████| 3/3 [00:00<00:00, 47.02it/s]
100%|██████████| 14/14 [00:00<00:00, 48.64it/s]
# List of all trajectories of the dataset
trajectories = utils.getTrajList(geo_dfList)
runsTrajectories = utils.getTrajList(runs_geo_dfList)
ridesTrajectories = utils.getTrajList(rides_geo_dfList)
hikesTrajectories = utils.getTrajList(hikes_geo_dfList)
# Trajectory example
trajectories[91]
Trajectory 91 (2021-04-24 07:43:50 to 2021-04-24 18:10:35) | Size: 3122 | Length: 102216.3m
Bounds: (10.849064, 45.850788, 11.12471, 46.085197)
LINESTRING Z (11.119156 46.057613 263.4, 11.119101 46.057592 263.2, 11.11886 46.057567 261.4, 11.118
trajectories[91].plot()
plt.show()
trajectories[91].hvplot(geo=True, tiles='EsriImagery', line_width=5, color='lightblue')
So this wasn't a loop after all, but an out-and-back trip to Lake Garda. Let's look into it further.
# Compute the speed (m/s) of every trajectory
for i in tqdm(range(len(trajectories))):
    trajectories[i].add_speed(overwrite=True)
# Add a column with the speed converted to km/h
for i in tqdm(range(len(trajectories))):
    trajectories[i].df['kmh'] = trajectories[i].df['speed'].apply(utils.ms_to_km)
100%|██████████| 152/152 [00:59<00:00, 2.54it/s]
100%|██████████| 152/152 [00:00<00:00, 552.20it/s]
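The body of `utils.ms_to_km` is not shown in this notebook. Presumably it converts metres per second to kilometres per hour by multiplying by 3.6 (3600 s per hour divided by 1000 m per km). A minimal sketch under that assumption; the function name `ms_to_kmh` is hypothetical, and the sample values are the `speed` and `kmh` entries from the first row of the table below.

```python
def ms_to_kmh(speed_ms):
    """Convert a speed from metres per second to kilometres per hour."""
    return speed_ms * 3.6


# speed (m/s) and kmh values from the first row of the trajectory table
first_row_speed = 1.618042
first_row_kmh = 5.824950

converted = ms_to_kmh(first_row_speed)
print("{:.6f} km/h (table: {:.6f} km/h)".format(converted, first_row_kmh))
```

The converted value agrees with the table to within rounding, which is consistent with the assumed factor of 3.6.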
trajectories[91].df
| time | longitude | latitude | elevation | geometry | speed | kmh |
|---|---|---|---|---|---|---|
| 2021-04-24 07:43:50 | 11.119156 | 46.057613 | 263.4 | POINT Z (11.11916 46.05761 263.40000) | 1.618042 | 5.824950 |
| 2021-04-24 07:43:53 | 11.119101 | 46.057592 | 263.2 | POINT Z (11.11910 46.05759 263.20000) | 1.618042 | 5.824950 |
| 2021-04-24 07:44:01 | 11.118860 | 46.057567 | 261.4 | POINT Z (11.11886 46.05757 261.40000) | 2.356896 | 8.484827 |
| 2021-04-24 07:44:08 | 11.118716 | 46.057441 | 248.0 | POINT Z (11.11872 46.05744 248.00000) | 2.556767 | 9.204360 |
| 2021-04-24 07:44:10 | 11.118725 | 46.057395 | 242.8 | POINT Z (11.11872 46.05739 242.80000) | 2.580113 | 9.288407 |
| ... | ... | ... | ... | ... | ... | ... |
| 2021-04-24 18:10:26 | 11.119483 | 46.056766 | 182.4 | POINT Z (11.11948 46.05677 182.40000) | 4.510490 | 16.237765 |
| 2021-04-24 18:10:27 | 11.119526 | 46.056744 | 182.2 | POINT Z (11.11953 46.05674 182.20000) | 4.129420 | 14.865911 |
| 2021-04-24 18:10:30 | 11.119630 | 46.056667 | 182.0 | POINT Z (11.11963 46.05667 182.00000) | 3.916085 | 14.097904 |
| 2021-04-24 18:10:33 | 11.119744 | 46.056621 | 182.0 | POINT Z (11.11974 46.05662 182.00000) | 3.398808 | 12.235708 |
| 2021-04-24 18:10:35 | 11.119790 | 46.056612 | 182.0 | POINT Z (11.11979 46.05661 182.00000) | 1.848786 | 6.655629 |
3122 rows × 6 columns
print("Average pace: {}; max pace: {}".format(np.round(np.average(trajectories[91].df.kmh), 2), np.round(np.max(trajectories[91].df.kmh), 2)))
Average pace: 20.16; max pace: 51.01
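The figure printed above is the unweighted mean of the per-sample speeds, so every GPS fix counts equally regardless of how much time it covers. As a cross-check, the overall speed can be computed as total length divided by elapsed time, using the start time, end time and length from the trajectory summary shown earlier. This is only a sketch of the arithmetic; the variable names are hypothetical.

```python
from datetime import datetime

# Values taken from the summary of trajectory 91
start = datetime(2021, 4, 24, 7, 43, 50)
end = datetime(2021, 4, 24, 18, 10, 35)
length_m = 102216.3

elapsed_s = (end - start).total_seconds()
# Length over elapsed time, converted from m/s to km/h
overall_kmh = length_m / elapsed_s * 3.6

print("Elapsed: {:.0f} s; overall speed: {:.2f} km/h".format(elapsed_s, overall_kmh))
```

The overall speed comes out at just under 10 km/h, about half of the per-sample mean. One plausible explanation, not verified here, is that breaks along the way contribute a lot of elapsed time but few samples.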
An average speed of 20 km/h isn't bad, but 51 km/h is definitely beyond my capabilities. Where did that peak occur?
trajectories[91].hvplot(c='kmh', geo=True, tiles='OSM', cmap='Reds', line_width=5, colorbar=True)
Between Nago and Torbole there is a stretch of road where my speed was very high in one direction and very low in the other. Why is that?
highSpeed = trajectories[91].df[trajectories[91].df.kmh > 40.0]
highSpeed
| time | longitude | latitude | elevation | geometry | speed | kmh |
|---|---|---|---|---|---|---|
| 2021-04-24 10:44:55 | 10.885169 | 45.877044 | 188.0 | POINT Z (10.88517 45.87704 188.00000) | 11.440208 | 41.184750 |
| 2021-04-24 10:45:00 | 10.884649 | 45.876551 | 184.0 | POINT Z (10.88465 45.87655 184.00000) | 13.612350 | 49.004460 |
| 2021-04-24 10:45:05 | 10.884096 | 45.876046 | 177.4 | POINT Z (10.88410 45.87605 177.40000) | 14.133370 | 50.880133 |
| 2021-04-24 10:45:10 | 10.883541 | 45.875546 | 169.6 | POINT Z (10.88354 45.87555 169.60000) | 14.064301 | 50.631484 |
| 2021-04-24 10:45:15 | 10.883008 | 45.875038 | 162.8 | POINT Z (10.88301 45.87504 162.80000) | 14.000708 | 50.402549 |
| 2021-04-24 10:45:19 | 10.882568 | 45.874675 | 157.2 | POINT Z (10.88257 45.87468 157.20000) | 13.216540 | 47.579545 |
| 2021-04-24 10:45:23 | 10.882104 | 45.874306 | 151.6 | POINT Z (10.88210 45.87431 151.60000) | 13.647074 | 49.129467 |
| 2021-04-24 10:45:25 | 10.881868 | 45.874115 | 148.8 | POINT Z (10.88187 45.87412 148.80000) | 14.021509 | 50.477434 |
| 2021-04-24 10:45:27 | 10.881623 | 45.873926 | 145.8 | POINT Z (10.88162 45.87393 145.80000) | 14.169672 | 51.010819 |
| 2021-04-24 10:45:30 | 10.881269 | 45.873657 | 141.6 | POINT Z (10.88127 45.87366 141.60000) | 13.537351 | 48.734462 |
| 2021-04-24 10:45:31 | 10.881156 | 45.873570 | 140.2 | POINT Z (10.88116 45.87357 140.20000) | 13.056724 | 47.004207 |
| 2021-04-24 10:45:33 | 10.880917 | 45.873396 | 137.2 | POINT Z (10.88092 45.87340 137.20000) | 13.401048 | 48.243773 |
| 2021-04-24 10:45:35 | 10.880679 | 45.873223 | 134.8 | POINT Z (10.88068 45.87322 134.80000) | 13.334094 | 48.002739 |
| 2021-04-24 10:45:37 | 10.880451 | 45.873061 | 132.6 | POINT Z (10.88045 45.87306 132.60000) | 12.625165 | 45.450593 |
| 2021-04-24 10:45:43 | 10.879825 | 45.872616 | 126.0 | POINT Z (10.87983 45.87262 126.00000) | 11.557422 | 41.606719 |
| 2021-04-24 10:45:46 | 10.879511 | 45.872388 | 122.8 | POINT Z (10.87951 45.87239 122.80000) | 11.721599 | 42.197757 |
| 2021-04-24 10:45:49 | 10.879201 | 45.872153 | 119.6 | POINT Z (10.87920 45.87215 119.60000) | 11.839487 | 42.622153 |
| 2021-04-24 10:45:52 | 10.878904 | 45.871904 | 117.4 | POINT Z (10.87890 45.87190 117.40000) | 12.007906 | 43.228463 |
| 2021-04-24 10:45:58 | 10.878309 | 45.871429 | 110.8 | POINT Z (10.87831 45.87143 110.80000) | 11.692301 | 42.092285 |
| 2021-04-24 10:46:04 | 10.877811 | 45.870949 | 104.0 | POINT Z (10.87781 45.87095 104.00000) | 11.440576 | 41.186072 |
| 2021-04-24 10:46:06 | 10.877681 | 45.870762 | 101.6 | POINT Z (10.87768 45.87076 101.60000) | 11.553034 | 41.590921 |
| 2021-04-24 15:03:47 | 10.908134 | 45.871694 | 243.8 | POINT Z (10.90813 45.87169 243.80000) | 11.178470 | 40.242492 |
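Whether the samples above 40 km/h form one continuous stretch can be checked by splitting the timestamps wherever two consecutive samples are far apart in time. A minimal sketch using the 22 timestamps from the table above; the helper `split_on_gaps` and the 10-second threshold are assumptions chosen for illustration, and the input is assumed to be non-empty and sorted.

```python
from datetime import datetime, timedelta


def split_on_gaps(timestamps, max_gap):
    """Group sorted timestamps, starting a new group after any gap above max_gap."""
    groups = [[timestamps[0]]]
    for prev, cur in zip(timestamps, timestamps[1:]):
        if cur - prev > max_gap:
            groups.append([])
        groups[-1].append(cur)
    return groups


# Clock times of the samples above 40 km/h (all on 2021-04-24)
clock_times = [
    "10:44:55", "10:45:00", "10:45:05", "10:45:10", "10:45:15", "10:45:19",
    "10:45:23", "10:45:25", "10:45:27", "10:45:30", "10:45:31", "10:45:33",
    "10:45:35", "10:45:37", "10:45:43", "10:45:46", "10:45:49", "10:45:52",
    "10:45:58", "10:46:04", "10:46:06", "15:03:47",
]
stamps = [datetime.strptime("2021-04-24 " + t, "%Y-%m-%d %H:%M:%S") for t in clock_times]

groups = split_on_gaps(stamps, timedelta(seconds=10))
for g in groups:
    print("{} -> {}: {} samples".format(g[0].time(), g[-1].time(), len(g)))
```

With this threshold the timestamps fall into two groups: one run of 21 samples in the morning and a single isolated sample in the afternoon.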
Apart from the last point, all the samples are consecutive, so I exceeded 40 km/h only on that stretch of the route. Not exactly a professional rider, I agree, but can you guess the reason for that speed peak?
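# Drop the isolated last point and project to UTM 32N (EPSG:32632) to get metres
segment = highSpeed.iloc[:-1].to_crs('epsg:32632')
first, last = segment.iloc[0], segment.iloc[-1]
length = first.geometry.distance(last.geometry)
elevation_change = last.elevation - first.elevation
print("Segment length: {} m; segment elevation change: {} m".format(np.round(length, 3), elevation_change))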
Segment length: 908.454 m; segment elevation change: -86.4 m
That's why: an 86 m drop in less than a kilometre is quite a steep descent! Truth be told, I still have vivid memories of the climb on the way back.
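To put a number on "steep", the average gradient follows directly from the two figures printed above. A short sketch of the arithmetic, assuming the 908.454 m straight-line distance between the segment endpoints stands in for the horizontal run; if the road bends, the true average gradient is somewhat lower. The 71-second duration is the difference between the first and last timestamps of the fast segment (10:44:55 to 10:46:06). Variable names are hypothetical.

```python
import math

length_m = 908.454          # straight-line segment length from the output above
elevation_change_m = -86.4  # elevation change from the output above
duration_s = 71             # 10:44:55 -> 10:46:06

# Gradient as a percentage (rise over run) and as an angle in degrees
grade_pct = elevation_change_m / length_m * 100
angle_deg = math.degrees(math.atan(elevation_change_m / length_m))
# Mean speed over the straight-line distance, in km/h
mean_kmh = length_m / duration_s * 3.6

print("Average gradient: {:.1f} % ({:.1f} degrees)".format(grade_pct, angle_deg))
print("Mean speed on the segment: {:.1f} km/h".format(mean_kmh))
```

An average gradient of roughly 9.5 % downhill explains the speed peak without much pedalling.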